The full emotional effect of a movie is mainly based on the music played, in combination with the visual information. Typical movie genres are romance, comedy, horror and action.
If you close your eyes and think about
Emotions are key elements in movies. When we think of a particular movie we’ve seen, it doesn’t take much to remember certain types of songs played in the movie. Different types of melodies, keys, instruments and many more aspects can produce a very different response in our brain.
In romantic dramas, emotions like ‘loving’ and ‘sense of longing’ are the main characteristics, but ‘sadness’ sometimes also plays a role. In horror movies, fear and anxiety are the main emotions expressed in the music, and dark overtones will be present. In action movies, emotions with high intensity, like excitement, are essential. In feelgood-comedy movies, ‘happiness’ and ‘joy’ are the main emotions, so these tracks will contain a lot of musical elements from ‘happy’ music, like major tones.
The corpus for this portfolio covers a representative selection of typical movies within the four movie genres. This selection is based on the movie genre categorization of the Internet Movie Database (IMDB). Movies within an IMDB genre with typical features from other genres were excluded, as well as ‘dialogue’ tracks from the Spotify albums. The Romance/Drama playlist contains 331 tracks (11 movies), the Feelgood/Comedy playlist contains 235 tracks (16 movies), the Horror playlist has 237 tracks (9 movies) and the Action playlist has a total of 223 tracks (12 movies). Only Spotify albums labeled ‘Official Motion Picture Soundtrack’ were selected.
In this portfolio there are multiple levels of analysis. First of all, individual tracks from the corpus are analyzed with, for example, chromagrams, keygrams and cepstrograms. After that, some analysis of the .. and lastly, a prediction analysis is done over the whole corpus.
What are the expectations?
A horror movie will be defined as a movie that seeks to scare or unsettle the audience. It is expected that this music is mostly written in the minor key. Minor chords are typically associated with sadness and melancholy. Music in horror movies wobbles and sounds deliberately out of tune, for example through a lot of glissandi on violins (the screeching upward). Pitch will be destabilized, and pitch drops are used to stress the ‘unexpected’. One of the most iconic sounds is the sudden sforzando tutti crash, designed to shock the audience instantly. It often happens in the midst of a musical silence, or after a pedal note.
Romantic dramas generally contain both ‘loving feelings’ and ‘sadness’. That is why this genre will probably oscillate between music written in the major key and music written in the minor key.
The action movie offers thrills (e.g. shooting) and spectacle (e.g. explosions).
Comedy movies are overall very happy. Happy tunes are typically written in the major key, and are expected to be louder than the music of other genres and more danceable, with a high valence.
What is the corpus of this portfolio?
Mode 0 = minor, mode 1 = major
This graphic shows the emotional quadrant of tracks played in movies from the genres Horror, Action, Feelgood/Comedy and Romance/Drama, with color representing the mode and size representing the loudness of the track. Valence describes the musical positiveness; energy describes the arousal. The Horror and Action genres are clearly distinct from the other two: their tracks mainly have very low valence values. Most of the tracks from Horror movies are concentrated in the Depressing/Sad section of the emotional quadrant, with many tracks in a minor key, and they are overall very ‘quiet’, as shown by the size of the dots: the bigger the dot, the louder the track. Tracks from Action movies are more spread out, but are located mainly in the Angry/Turbulent and Depressing/Sad sections of the graph. Surprisingly, the majority of these tracks do not seem to be very loud at all, which contrasts with the expectations. Feelgood/Comedy tracks are more scattered, but compared with the other genres, this genre has many tracks in the Happy/Joyful section, with louder tracks, which is in line with the expectations. The tracks of Romance/Drama movies are spread throughout the whole plot, but have a low valence overall.
| category | mode | median tempo (BPM) |
|---|---|---|
| Action | Major | 117.5940 |
| Action | Minor | 98.8900 |
| Feelgood/Comedy | Major | 125.0285 |
| Feelgood/Comedy | Minor | 115.9900 |
| Horror | Major | 88.5990 |
| Horror | Minor | 90.3370 |
| Romance/Drama | Major | 110.0880 |
| Romance/Drama | Minor | 101.9310 |
Feelgood/comedy movies have a higher tempo than the other three genres (around 120 BPM). The tempo for horror movies is the lowest (around 90 BPM). The distribution for romance/drama and action is about the same. Overall, tracks in a minor key have a somewhat lower BPM than tracks in a major key, most clearly in action movies; horror is the exception, where the minor-key median is slightly higher than the major-key median.
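As a minimal sketch of how such a table of medians can be computed, the Python snippet below groups tempo values by genre and mode and takes the median of each group. The rows in `tracks` are hypothetical toy values, not the tracks from this corpus, and the function name `median_tempo` is only illustrative.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy rows: (genre, mode, tempo in BPM). Not the portfolio's data.
tracks = [
    ("Horror", "Minor", 84.0),
    ("Horror", "Minor", 96.0),
    ("Horror", "Major", 88.0),
    ("Action", "Major", 120.0),
    ("Action", "Major", 110.0),
    ("Action", "Minor", 99.0),
]


def median_tempo(rows):
    """Group tempos by (genre, mode) and return the median of each group."""
    groups = defaultdict(list)
    for genre, mode, tempo in rows:
        groups[(genre, mode)].append(tempo)
    return {key: median(values) for key, values in groups.items()}


medians = median_tempo(tracks)
for (genre, mode), value in sorted(medians.items()):
    print(f"{genre:8s} {mode:6s} {value:6.1f}")
```

The median is less sensitive than the mean to a few tracks with an extreme (or misdetected) tempo, which is one reason to prefer it for a summary like this.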
In a self-similarity matrix, each element of the feature sequence is compared with all other elements. In this graphic, the x and y axes both represent the song ‘Feelgood’, a typical Feelgood/Comedy track (what else!). Path-like structures represent exact repetitions. There is one main diagonal visible, because both axes represent the exact same song. Block-like structures represent homogeneous regions, where musical features stay roughly constant over the duration of an entire musical part.
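The idea of comparing every frame with every other frame can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses cosine distance (so 0 means identical), and the `frames` are hypothetical three-dimensional feature vectors following an A B A B pattern rather than real chroma or timbre vectors.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat silent (all-zero) frames as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def self_similarity(frames):
    """Compare every frame with every other frame (square, symmetric matrix)."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical feature frames with the repeating pattern A B A B.
frames = [
    [1.0, 0.0, 0.0],  # A
    [0.0, 1.0, 0.0],  # B
    [1.0, 0.0, 0.0],  # A again
    [0.0, 1.0, 0.0],  # B again
]
ssm = self_similarity(frames)
for row in ssm:
    print(" ".join(f"{value:.1f}" for value in row))
```

In the printed matrix, the zeros on the main diagonal come from comparing each frame with itself, while the zeros at positions (0, 2) and (1, 3) form a short line parallel to the main diagonal: the repetition of A B.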
The visualization on the left is a self-similarity matrix for ‘chroma’. It shows at which points in the track the same pitches occur. The visualization on the right shows the same song for ‘timbre’, also referred to as ‘tone color’. Later on there will be more information about this musical feature.
In the self-similarity matrix of timbre, it is clear that there are two parts that differ in timbral components; these are marked with red vertical lines.
As you may have noticed, these time segments correspond to the segments of the cepstrogram of the previous page. This means that they represent the same riffs.
‘Another dance’ is a track from the Romance/Drama movie ‘Pride and Prejudice’. In this example only diagonal lines are visible, and no block-like structures. Lines that run parallel to the main diagonal indicate exact repetitions. These repetitions are very clear when listening to the track.
Both graphics on the left are chromagrams. A chromagram sums all pitch coefficients that belong to the same chroma, so it is cyclic in nature. The y-axis displays the pitches. The graphics on the right are keygrams, with keys displayed on the y-axis. In both types of graphic, the x-axis is the duration of the track.
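The folding of pitches into chroma can be illustrated with a small Python sketch. It assumes that pitches are given as MIDI note numbers (where 60 is commonly taken as C4) and that each pitch comes with some energy value; the `frame` below is a hypothetical example, and `fold_to_chroma` is an illustrative name.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def fold_to_chroma(pitch_energy):
    """Sum the energy of all pitches that share a pitch class.

    pitch_energy maps a MIDI note number to an energy value. Notes that are
    an octave (12 semitones) apart land in the same chroma bin, which is why
    the representation is cyclic.
    """
    chroma = [0.0] * 12
    for midi_note, energy in pitch_energy.items():
        chroma[midi_note % 12] += energy
    return chroma


# Hypothetical frame: C4 (60) and C5 (72) sound together with G4 (67).
frame = {60: 0.5, 72: 0.25, 67: 1.0}
chroma = fold_to_chroma(frame)
for name, value in zip(PITCH_CLASSES, chroma):
    if value > 0:
        print(name, value)
```

Both C notes end up in the same bin, so octave information is deliberately discarded.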
Two different tracks are examined here: * ‘Curse Your Name’, a Horror track from the movie ‘The Lighthouse’. This track is written in a minor key and has an extremely low valence (0.021). It has both sections with long-lasting pitches and sections with many pitches played at the same time, which makes it very typical of Horror film music. These sections are marked with white vertical lines.
This chord is the G#major(9) chord. The chromagram of Curse Your Name is more smeared out. A lot of different tones are played at the same time, as well as very high and very low pitched sounds, throughout the whole track. This track is played in a minor key and sounds a bit out of tune. This is very typical for horror film music, because a lot of tones are played at the same time, even if they do not sound harmonically ‘correct’ to the human ear.
Timbre, also known as “tone color”, is the perceived sound quality of a musical note or sound. It distinguishes various types of musical instruments. There are twelve timbre coefficients in total. The values are high-level abstractions of the spectral surface, ordered by degree of importance. The first coefficient represents the ‘average loudness’, the second indicates ‘brightness’, the third is more closely related to the ‘flatness’ of a sound and the fourth to sounds with a ‘stronger attack’. Increased levels of mid- and high-frequency content are referred to as ‘brighter’. A high flatness indicates that the spectrum has a similar amount of power in all spectral bands (i.e. similar to white noise), and a low flatness indicates a “spiky” spectrum (a mixture of sine waves).
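To make the notion of flatness concrete, here is a minimal Python sketch of one common definition of spectral flatness: the geometric mean of a power spectrum divided by its arithmetic mean. This is an assumption for illustration only; it is not necessarily how the third Spotify timbre coefficient is computed, and the two spectra are hypothetical.

```python
import math


def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of a power spectrum.

    Values near 1 mean power is spread evenly over all bands (noise-like);
    values near 0 mean power sits in a few bands (a "spiky" spectrum).
    """
    eps = 1e-12  # avoids log(0) for empty bands
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arithmetic_mean = sum(power) / len(power)
    return math.exp(log_mean) / (arithmetic_mean + eps)


# Hypothetical power spectra with eight bands each.
flat = spectral_flatness([1.0] * 8)            # equal power in every band
spiky = spectral_flatness([8.0] + [0.0] * 7)   # all power in a single band
print(f"flat spectrum:  {flat:.3f}")
print(f"spiky spectrum: {spiky:.3f}")
```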
From the graph of Doll Box, it is very clear which timbre components are used in this track: c02, c03 and c04. The sound of this track represents the typical sound of a doll music box. Because these three components are very constant throughout the whole track, it is hard to distinguish the different sound characteristics. Two clear spots in this cepstrogram are the very contrasting parts at t = 1 and at t = 32. These parts are riffs on a copper xylophone.
As you can see, the timbral components of Curse Your Name are much more spread out in the cepstrogram. This is very typical of horror music, because horror music tends to have a lot of different musical characteristics and instruments playing at the same time, which is meant to give the audience an uncomfortable and unsettling feeling. The first bright yellow part of c02 is very distinguishable in this track. It represents a wind instrument, probably a trumpet. Immediately after this part, a somewhat longer c01 appears. This part is a very low-pitched string instrument; it sounds like a string bass. The yellow part of c05 is a very sharp sound, a high-pitched flute which is very unpleasant to the ear. It is clear that both very high-pitched and very low-pitched sounds are used at the same time in typical horror music.
When comparing c03 in both tracks: in both tracks this is a very high-pitched instrument. However, in Doll Box this sound is made by a percussion instrument, whereas in Curse Your Name it sounds more like a wind instrument.
For this analysis, the first 80 tracks from each genre were used.
When comparing the twelve Spotify timbre coefficients between the four movie genres, the main difference lies in the second and third coefficients. For c02, horror music really differs from the other genres: its values lie much more in the positive range. This is probably due to the use of stringed instruments in horror music.
The analysis is performed over 120 tracks from each genre category.
Genre classification with all four genres was performed using support vector machines in a ten-fold cross-validation test. Here, three confusion matrices are compared: one for all features (track-level, timbral and chroma features), one for timbral features only and one for chroma features only. The darker the grey, the better the prediction of the tracks.
F-value: The confusion matrices show a clear dark-grey diagonal from the upper left to the bottom right. This means that the model predicted the tracks on this diagonal, where the predicted genre equals the true genre, best.
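The ten-fold cross-validation scheme itself can be sketched independently of the classifier. The Python snippet below only shows how a set of tracks is divided into ten folds, each of which serves once as the test set; it is a simplified illustration that assumes 480 items (4 genres of 120 tracks), does not shuffle or stratify, and leaves out the support vector machine entirely. The name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; every item is tested exactly once."""
    # Deal the indices round-robin into k folds of (nearly) equal size.
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Assumed corpus size: 4 genres x 120 tracks.
splits = list(k_fold_indices(480, k=10))
for number, (train, test) in enumerate(splits, start=1):
    print(f"fold {number}: {len(train)} training tracks, {len(test)} test tracks")
```

In each round a model would be trained on the training indices and evaluated on the held-out test indices; collecting the test predictions from all ten rounds gives one prediction per track, from which a confusion matrix can be built. Practical implementations usually shuffle the data and keep the genre proportions equal across folds.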
F-value:
F-value: Chroma features (keys) do not seem to predict movie genre very well. Music from Horror movies is the least distinguishable when looking at chroma features. However, true Action tracks are distinguished much better from true Horror and Romance/Drama tracks.
A striking observation is that of the true Horror tracks, only 22% were correctly categorized as a ‘Horror track’. This percentage lies even below the chance level of 25%. This suggests that the chroma features of Horror tracks are very much like those of Romance/Drama tracks.
| class | precision | recall |
|---|---|---|
| Action | 0.5041322 | 0.5083333 |
| Feelgood/Comedy | 0.5963303 | 0.5416667 |
| Horror | 0.6168224 | 0.5500000 |
| Romance/Drama | 0.5104895 | 0.6083333 |
| class | precision | recall |
|---|---|---|
| Action | 0.4590164 | 0.4666667 |
| Feelgood/Comedy | 0.5000000 | 0.4500000 |
| Horror | 0.5925926 | 0.5333333 |
| Romance/Drama | 0.5352113 | 0.6333333 |
| class | precision | recall |
|---|---|---|
| Action | 0.3846154 | 0.4166667 |
| Feelgood/Comedy | 0.3059701 | 0.3416667 |
| Horror | 0.3626374 | 0.2750000 |
| Romance/Drama | 0.5200000 | 0.5416667 |
It is clear that, overall, all features together predict movie genre better than timbral features alone. For timbral features, Feelgood/comedies are most often confused with Romantic dramas and Action movies. This means that the timbral features of feelgood/comedies are somewhat similar to those of Romantic dramas and action movies. The timbral features of horror and feelgood/comedies are the most distant from each other.
From the forest model, it is very clear that there are some features that really characterize the four movie genres. The top 6 features are:
It seems that timbre c06 is an important component in distinguishing movie genres. However, because timbre components above c04 are very hard to identify, it is not very clear what aspect this exactly is.
Next, we’ll perform a new analysis on these 8 components that really differ.
Comparing the model that uses only the top 8 components from the previous page with the other models:
| class | precision | recall |
|---|---|---|
| Action | 0.4800000 | 0.4000000 |
| Feelgood/Comedy | 0.5000000 | 0.5333333 |
| Horror | 0.5267857 | 0.4916667 |
| Romance/Drama | 0.4642857 | 0.5416667 |
| class | precision | recall |
|---|---|---|
| Action | 0.5041322 | 0.5083333 |
| Feelgood/Comedy | 0.5963303 | 0.5416667 |
| Horror | 0.6168224 | 0.5500000 |
| Romance/Drama | 0.5104895 | 0.6083333 |
| class | precision | recall |
|---|---|---|
| Action | 0.6404494 | 0.4750000 |
| Feelgood/Comedy | 0.5934959 | 0.6083333 |
| Horror | 0.6513158 | 0.8250000 |
| Romance/Drama | 0.6810345 | 0.6583333 |
With the decision tree, the precision and recall for the different genres are much better than those from the confusion matrices. Recall (sensitivity) is the proportion of tracks from a genre that were correctly assigned to that genre. The recall for the Horror tracks is the highest, at 75%: 75% of the tracks that belong to Horror films are correctly categorized. Precision and recall are lowest for Action movies.
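The relation between a confusion matrix and the precision and recall values can be sketched as follows. The counts in `matrix` are hypothetical and cover only two genres to keep the example small; the function name `precision_recall` is illustrative.

```python
def precision_recall(matrix, labels):
    """Compute per-class precision and recall from a confusion matrix.

    matrix[i][j] is the number of tracks whose true genre is labels[i]
    and whose predicted genre is labels[j].
    """
    result = {}
    for k, label in enumerate(labels):
        true_positive = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column total
        actually_k = sum(matrix[k])                     # row total
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        result[label] = (precision, recall)
    return result


# Hypothetical counts for two genres with 10 true tracks each.
labels = ["Action", "Horror"]
matrix = [
    [8, 2],  # true Action: 8 predicted Action, 2 predicted Horror
    [4, 6],  # true Horror: 4 predicted Action, 6 predicted Horror
]
scores = precision_recall(matrix, labels)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.3f} recall={recall:.3f}")
```

Recall looks along a row (of all true Horror tracks, how many were found), while precision looks down a column (of all tracks labeled Horror, how many really were Horror).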
The results support the notion that high intensity movies like action and horror, have musical cues that are measurably different from the scores of movies with more measured expression of emotion, like comedy and romance.
This small study presents a preliminary examination of a corpus of music collected from film scores in four genres (Action, Romance/Drama, Horror and Feelgood/Comedy), from a total of 48 movies, using a range of music representations: track-level features, chroma and timbre self-similarity matrices, musical keys, and tempo.
From the emotional quadrant
Initial results suggest that the music from
However, even when using very distinct movie genres, it is clear that such a labeling scheme is likely too broad, as several tracks within a specific genre may exhibit characteristics of music from another genre. For example, we’ve seen that music from feelgood/comedies is very similar to that of action movies when looking at timbral features.
A closer examination of each individual track will probably help to improve classification accuracy.
The author
This portfolio was made by Iris Gaarthuis. I am a psychology student following the course Computational Musicology as part of the Minor Kunstmatige Intelligentie. I especially wanted to choose a subject that was related to the field of psychology, which is why I chose this one. I hope you enjoyed reading this portfolio about film music and its emotional aspects.